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Abstract 

We study autocorrelation times of physical observables in lattice QCD as a function 
of the molecular dynamics trajectory length in the hybrid Monte-Carlo algorithm. In 
an interval of trajectory lengths where energy and reversibility violations can be kept 
under control, we find a variation of the integrated autocorrelation times by a factor of 
about two in the quantities of interest. Trajectories longer than conventionally used are 
found to be superior both in the iVf = and Nf = 2 examples considered here. We also 
provide evidence that they lead to faster thermalization of systems with light quarks. 



1 Introduction 



Hybrid Monte-Carlo (HMC) "T. is the most widely used algorithm to simulate non- 
Abelian gauge theories in the presence of Nf > quark flavours. For Wilson-type quark 
actions it was found, in its most straightforward application, to become prohibitively 
expensive in the regime of phenomenological interest (with two very light quarks) P]- 
Recent advances IH El however have led to a substantial lowering of the CPU time 
of simulations. These more sophisticated forms of HMC typically have a number of 
parameters that have to be tuned to guarantee an efficient simulation. One parameter 
that is however common to all HMC algorithms is the length r of the trajectories of 
'molecular dynamics' evolution, at the end of which an accept/reject step is performed 
to correct for any finite-step-size errors. 

The algorithm efficiency's dependence on r has only been rarely and partially studied 
in the literature |71|H1IH1E1- The reason for this is easy to understand. If we neglect the 
cost of evaluating the Hamiltonian, performing trajectories of length r = 2 represents the 
same computational cost as twice as many trajectories of length 1: indeed, as discussed 
in the appendix, the average energy violations along a trajectory at fixed step-size 
are weakly dependent on the length within a reasonable range, and the reversibility 
violations increase rather slowly (see however ^2). Hence, with a high acceptance rate 
in both cases, to discriminate between these two running modes one has to determine the 
autocorrelation times of the observables of interest with some accuracy and reliability. 
This in turn requires very high statistics, which one is normally not ready to invest into 
parameter optimization. In this letter we study this question in various situations where 
high statistics can be achieved, and yet some realistic features of difficult simulations 
are present. 

Whether in the end our conclusions carry over to other forms of HMC algorithms 
will require direct testing. However we show that the dependence of the algorithm's 
efficiency on r can be significant, and a factor around two in computing time can easily 
be wasted in the case of an unfortunate choice of trajectory length. 

2 Data 

We employ the hybrid Monte-Carlo algorithm pQ. The Hamiltonian governing the evo- 
lution of the molecular dynamics is 

H=lJ2^{U^{x)U^{x)} + S[U]. (1) 

fl,X 

where the momenta H^ are traceless Hermitian matrices. The gauge fields are thus 
updated according to U'^{x) = exp[in^(a;)(ir]C/^(a;). In the N{ = 2 simulations, two 
pseudofermions are used to stochastically represent the determinant of the 0(a) im- 
proved Wilson fermions: the fermionic part of S results from even-odd- jllj and mass- 
preconditioning [1] the Hermitian Dirac operator. The gauge part of S is the Wilson 
plaquette action. 

When A^f = 2, we use the Sexton- Weingarten integration scheme |T3|, with a step- 
size four times larger for the fermionic forces; the quoted step-size dr always denotes 
the largest one. A complete update cycle with trajectory length r performs an integer 
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Table 1: The parameters of the physical systems considered. A boundary field (BF) 
means that it is vanishing, while A refers to 'point A' of Ref. jl6j . 

number r/dr of such steps followed by the Metropolis decision. The system has then ad- 
vanced by r molecular dynamics (MD) units, and in the rest of this paper all quantities 
referring to Monte Carlo time are given in MD units. If for instance successive mea- 
surements of observables are separated by M trajectories, an autocorrelation function 
r(i) arises where i refers to successive measurements. Prom it we define the integrated 
autocorrelation time, that is directly relevant for the statistical error, in our MD units 
as 



Tint = Mr 



^ + E.>iP(0 , p(i) = r(i)/r(o). 



In numerical estimates the sum has to be truncated. If we specify a window W (in MD 
units) this amounts to the restriction i < W/{Mt). We refer the reader to |E] for the 
definition of T{i), in particular for derived quantities, such as an effective mass. The 
error bars shown in plots of p are computed using Eq. (E.ll) of reference ^ . 

We investigate the trajectory length dependence of autocorrelation times in three 
different systems (A,B,C: see Tab.^. All simulations were done in 32-bit arithmetic, 
except for system C (64-bit). The system D will be used as a playground for thermal- 
ization. The lattice spacing is known in units of rg '17" at 1% level for the Nf = runs 
and 5% for the Nf = 2 runs. All lattices have Schrodinger functional boundary condi- 
tions. Since we will be using somewhat longer trajectories than is usual, the question 
of reversibility violations arises. We check for this in the standard way (see e.g. |71) 
by monitoring the Hamiltonian variation {\5H\^) under the following operations : a 
trajectory 'forward', reversing the sign of the momenta, and integrating back to the 
starting point. 

The observables we will focus on are those which typically have autocorrelation times 
much larger than that of the plaquette. One observable (defined already in the pure 
gauge theory) is dS/drj; the inverse of its expectation value defines the Schrodinger 
functional renormalized coupling constant ^S]- The others are fermionic: the corre- 
lators fAixo) and /p(xo) correspond to the propagation of a quark and an antiquark 
from a boundary to a point in the bulk of the lattice where the axial current or pseu- 
doscalar density annihilates them Finally we also consider /i, which is the ampli- 
tude for boundary-to-boundary propagation through the lattice. It serves to normalize 
the other correlators. In particular, Z j^f x{xq) / \/Ji ~ i?pg e"^^""^/^^ ^^^^ holds far from 
the boundaries, where Mps and Fps correspond to the pseudoscalar mass and decay 
constant IT5l. 
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Table 2: The autocorrelation times (in units of MD time) for dS/drj, the acceptance, 
the energy and reversibility violations for different choices of trajectory length. W is the 
window in MD time over which the autocorrelation time has been accumulated. From 
top to bottom: L = 8, 12 and 16, while La/ro=const. Runs in single precision. 

2.1 HMC in the pure gauge theory 

We start by investigating autocorrelation times for gluonic quantities in the pure-gauge 
HMC (system A), where we have three different lattice spacings at constant physical 
parameters (see Tab. HJ. The acceptance rates, as well as {6H'^), are given (the step- 
size is the same in all these runs); we are clearly in the regime of high acceptance Pa 
where {SH'^) — 27r(l — Pa)'^ HH]- The dependence of {SH"^) on r is moderate. Therefore 
the dependence of Pa is even weaker, but note that the effect of lower acceptance at 
larger r is automatically taken into account in the autocorrelation of observables. The 
reversibility violations as measured by {\dH\^) grow as ^/T here, or even more slowly; the 
reader is referred to the appendix for a more detailed discussion of {SH'^) and {\6H\^). 
We now focus on the observable dS/dr] defined in ^S]. We emphasize that, unlike the 
plaquette, this is an observable dominated by long-distance fluctuations and it typically 
has an autocorrelation time significantly larger than one. 

The most direct way to evaluate the performance of different r-choices for a given 
observable is to compare the corresponding normalized autocorrelation function, p{t). 
This is done on Fig. ^ for trajectory lengths 1/2, 1, 2, 4. There is a marked difference 
between, say, r = 1/2 and r = 2 in favour of the latter, at all MD-time separations. The 
choice r = 4 is slightly superior to r = 2 at short MD-time separation at the largest 
lattice spacing, while the opposite is true at the finest lattice spacing. Note in particular 
that at the shortest time separation, p is much smaller for r = 2 than for r < 1. In fact, 
the T = 2 autocorrelation function shows a significant non-monotonic behaviour, as we 
have observed previously only for hybrid overrelaxation algorithms. 
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Figure 1: Autocorrelation function of dS/drj in a pure gauge HMC (system A), for 
different trajectory lengths (from top to bottom: L = 8, 12 and 16). 
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Pure gauge theory L=0.7fm 
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Figure 2: The truncated autocorrelation time of dS/drj in the pure gauge system (A), 
as a function of the trajectory length r and for different lattice spacings. 



For a run of given MD-time length and measurement frequency, the error squared of 
the observable is proportional to 1/ Tint • When Tint is large compared to the measurement 
frequency, it corresponds to the area under the curves shown in Fig. ^ In practice, a 
window W must be chosen where to stop the summation. This may be chosen in a self- 
consistent way which balances statistical against systematic errors. We use by default 
the prescription described in However, to rate relative performances of algorithms, 
we also compute here an integrated autocorrelation time rtrunc where the summation is 
truncated at a fixed window W = 25. It turns out that typically 80% of the true Tint has 
by then been accumulated; the uncertainty on rtrunc is almost half of that on the full 
Tint and the hierarchy between the different trajectory lengths is at any rate maintained 
(Tab.©. 

Fig. El then illustrates the dependence of the truncated autocorrelation time on the 
trajectory length r. One sees that this quantity is minimized somewhere around r = 2, 
and this conclusion is largely independent of the lattice spacing in the range considered. 
Overall we see a variation by a factor two of Ttmnc for 1/2 < r < 4. Generally speaking 
this is a substantial variation which translates directly into a corresponding speed-up of 
simulations whose cost is dominated by the HMC. 

2.2 HMC in quenched QCD 

As our next observables we now consider fermionic correlators. Although we are ulti- 
mately interested in dynamical, large- volume simulations (system C), we first investigate 
the autocorrelation times in system B, which, roughly speaking, is a quenched version 
of system C with in addition the spatial extent divided by 3. The computing time is 
now dominated by the measurements. The run-length is tmn — 32000 in total (four 
independent lattices, called 'replica', were simulated). We investigate the range r = 1/2 
to r = 4, within which we see hardly any variation of the acceptance, and {\5H\^) scales 
roughly with y/r: Tab. 01 shows this, as well as some relevant autocorrelation times. 
The comparison of autocorrelation functions for fi and /p is done on Fig. |31 The 
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Figure 3: Autocorrelation function of the correlators fi and /p in quenched QCD (sys- 
tem B), for different trajectory lengths. 
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Table 3: The autocorrelations time (in units of MD time) of two primary observables 
(/i and /p) and of the effective pseudoscalar mass and decay constant in the middle of 
the lattice, the acceptance, the energy and reversibility violations for different choices 
of trajectory length. Runs in single precision. 
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Figure 4: Autocorrelation function of the correlators /i and /p in A'f = 2 QCD (system 
C), for different trajectory lengths. 



advantage of r = 4 and r = 2 over r = 1/2 is both substantial and statistically signifi- 
cant. This is reflected in a factor of about two in the integrated autocorrelation times. 
This gain is at least as marked in physical quantities such as the effective pseudoscalar 
mass and decay constant. 

2.3 Mass-preconditioned HMC in A^f = 2 QCD 

We now come to A'^f = 2 dynamical simulations in a spatial volume of (2fm)^, and con- 
sider the same fermionic correlators as in the previous section. The quark mass is around 
nis, the strange quark mass, and we have two values of r, 1/2 and 2. The total statistics 
in each case is about irun = 4000 (2 replica were simulated). 

Note that the acceptance is practically the same in both runs. The autocorrelation 
functions of /i and /p are compared on Fig. 0J Naturally the error bars are now larger; 
nonetheless a statistically significant advantage of the run at r = 2 is seen at MD- 
time separations up to about 30. This is confirmed when one looks at the truncated 
autocorrelation time with a window oi W = 50. The autocorrelation function of /i is 
in fact surprisingly similar with that in the quenched simulation, Fig. |31 The resulting 
autocorrelation times are given in Tab. HI In spite of large uncertainties, there is a 
significant reduction of 7"int[/pg(T/2)] when going from r = 1/2 to r = 2. 

2.3.1 Thermalization with light quarks 

Since we observe that autocorrelation times of long-distance observables are reduced by 
the use of longer trajectories, we also expect their thermalization to be accelerated by 
the latter. Consider the thermalization of system D (see Tab.^. The exercise here is 
to 

1. start from the ensemble with non-zero boundary field ('point A' of |16j^: 

2. revert to homogeneous Dirichlet boundary conditions, and hence vanishing back- 
ground field [TH] : 
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Table 4: The autocorrelations time (in units of MD time) of two primary observables 
(/i and /p) and of the effective pseudoscalar mass and decay constant in the middle of 
the lattice, the acceptance, the energy and reversibility violations for different choices 
of trajectory length. The value of {6H'^) / (L'^Tdr^) marked by a * was obtained for one 
of the replica; the other had one spike of 5H ~ 2000. Runs in double precision. 



Nf=2 therm. (BF=A->0), 12^, L=0.7fm, m=0. 0025(3) 




m=0.0025(3) 




Figure 5: Thermalization of two long-distance observables [dS/di] and /p — /p) having 
zero as expectation value. The starting configurations are taken from an ensemble where 
these expectation values are non-zero (system D). 



3. let the system rethermalize under the HMC evolution, for two different choices of 
trajectory length; 

4. track how fast quantities whose expectation values vanish by symmetry in the 
homogeneous Dirichlet case relax to zero. 

Fig. [5] shows the relaxation of the observables dS/dr] and /p — /p. The latter is the 
asymmetry between the correlator emanating from one boundary and the other. A data 
point at time t shows the value of the observable averaged over 16 independent replica 
(corresponding to independent starting configurations), and averaged over the MD time 
interval ]t — 6,t + 6]. The dS/dij measurements were done after every trajectory and 
/p — /p every other trajectory. 

The thermalization takes place faster with the choice of trajectory length r = 2, at 
least in the relatively early stages. In difficult simulations the time-consuming part is 
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presumably the tail of the thermalization, but it would require very large statistics to 
demonstrate an algorithmic advantage in that regime. 

3 Conclusion 

We have investigated the dependence of autocorrelation times on the HMC trajectory 
length, focusing on long-distance observables, in a variety of different physical situations. 
We find this dependence to be substantial (a factor around 2) and statistically significant. 
The reduced correlation of successive measurements, done at fixed intervals of molecular 
dynamics time, is most clearly seen in the autocorrelation functions themselves. This 
means that a reasonable tuning of the trajectory length may save a factor of about 2 in 
computing time. 

The optimal choice of r is observable dependent. However we have observed that 
trajectories longer than the ones conventionally used provide an advantage in computing 
standard physical quantities such as the pseudoscalar mass and decay constant. It will 
be interesting to see whether this conclusion also holds for HMC algorithms that use a 
different preconditioning of the pseudofermion action, for a different number of fiavours, 
etc. 

Naturally, other criteria are relevant in the final choice of trajectory length; the 
issues of stability and reversibility violations have been addressed in the appendix (see 
also Ref. [3 I12| I24j to name a few). We are currently performing a simulation at a 
quark mass of ms/2 with r = 2 that shows good stability and controlled reversibility 
violations. 
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Appendix: Stability and reversibility violations 

In this appendix we discuss some important issues that one might worry about if one 
increases the length of MD trajectories. Most of what follows is not new but we find it 
useful to gather here the relevant points. 

Energy violations along a trajectory 

Fig. El shows the variation of the Hamiltonian along one MD trajectory on lattice C, for 
two different step sizes (here the leap-frog integrator was used with a ratio of 5 between 
the time-steps of the two pseudo-fermion forces |^). The right plot illustrates the fact 
that the fiuctuations of 6H{t) around zero do not grow very fast with MD time t. 
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MD time MDtime 

Figure 6: Variation of the Hamiltonian along one MD trajectory on lattice C, in units 
of \/V dT^ ^ where V is the number of lattice points. 



Although a symplectic integrator such as leap-frog has discretization errors, there is 
a Hamiltonian which differs by 0((ir^) terms from the MD Hamiltonian, and which is 
exactly conserved by the MD evolution jl81 125j : 

i/o = H{t)+ dT^hl (t) + dT^h2 {t) + ... (3) 



Hence, for a given start configuration and momenta, 

6H{t) 



dr'^ 



6hi{t)+dT'5h2{t) + ... (4) 



In words, the curve 5H{t)/dT'^ is independent of dr, up to 0((ir^) corrections. This is 
clearly what is seen on the left plot of Fig. [Bl Note that since in equilibrium {5H) ~ 
1/2{6H^) = O(dT^) > holds, we have {5hi{t)) = and {6h2{t)) < for any fixed t. 



Scaling of energy violations 

The scaling law {6H'^) ~ Vdr^ was proposed in |18[ I19j. Indeed, since 6H is O(dr^) 
for one trajectory, 5H^ is O(dr^) and so is its average. As to the volume dependence, 
we confirm that {6H'^) depends mainly on the number of lattice points V [20], while 
the dependence on the coupling /? is very weak^. The values of {SH'^) in Tab. [U are 
consistent with {SH"^) cx y(^), where rj € [1.07, 1.11]. 

The dependence of {SH'^) on the trajectory length r is very moderate but follows no 
obvious formula (see Tab. E)). Nevertheless, the trend is reasonably well described by 
r", 0.3 < a < 0.4. 



Stability 

Of course the statements made in the previous paragraph hold for exact arithmetics, 
and in practice it must be checked empirically at what trajectory length rounding errors 
introduce instabilities in the integration ^2^- The other source of instability can be 

^Additional /3 — 6.086 simulations at L = 12 and 16 in system A with r — 2 give {S H'^) / (Ldr)'^ = 
1.31(2) and 1.36(6) respectively. 



10 



the occurrence of an exceptionally large force, which spoils the expansion in dr. In the 
simulation C (see Tab.^, \6H\ was bounded by 1 in the r = 2 run, while one replicum 
experienced one spike of 8H ~ 2000 in the r = 1/2 run (the simulation then continued 
normally) . 



Reversibility violations 

Reversibility violations in our simulations come from rounding errors, and from the non- 
zero residuals of the inversions in the MD evolution^ . These effects of course accumulate 
with the length of the trajectory. However Tab. 121 121 and 0] show that the increase of 
i\5H^\) with r in simulations where \6H\ is bounded by one and is typically much 
smaller, is rather slow, roughly like ~ ^/t. Therefore it should not present a serious 
problem. In the trajectory of Fig. with dr = 0.04, 10^ • {\8H\^) is 0.06, 0.3, 1.4 and 
1.4 for T = 0.08, 0.8, 1.6 and 8 respectively. 

How a small reversibility violation influences the ensemble generated is not known 
(to us); it may lead to an incorrect sampling. The influence on expectation values of 
observables is even harder to pin down. The study j2j showed that even with {\5H^\) = 
0.01 one does not see an effect in a number of observables computed at the sub-percent 
level. 
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